Preview of Award 2228092 - Annual Project Report

Cover

Federal Agency and Organization Element to Which Report is Submitted:

4900

Federal Award or Other Identifying Number Assigned by Agency:

2228092

Project Title:

CPS: Small: Learning How to Control: A Meta-Learning Approach for the Adaptive Control of Cyber-Physical Systems

PD/PI Name:

Michael D Lemmon, Principal Investigator

Recipient Organization:

University of Notre Dame

Project/Grant Period:

06/15/2023 - 05/31/2026

Reporting Period:

06/15/2023 - 05/31/2024

Submitting Official (if other than PD\PI):

N/A

Submission Date:

N/A

Signature of Submitting Official (signature shall be submitted in accordance with agency specific instructions)

N/A

Accomplishments

* What are the major goals of the project?

Project Overview: This project is developing methods that "learn how to control" complex cyber-physical systems (CPS) found in Manufacturing 4.0 applications. In this project, the cyber fabric is formed from a network of "digital twins" for the machines on the factory floor and the physical fabric is formed from the physical machines and materials moving across the factory floor. The proposed approach uses a meta-learning framework to generate adversarial jobshop scheduling scenarios for the manufacturing facility. It uses those adversarial scenarios in decentralized multi-agent reinforcement learning (MARL) to safely manage the network of digital twins. The project is using methods from model-following control and the supervisory control of hybrid systems to "safely" transfer the action policies of the digital twins to the physical machines and materials on the factory floor. The project will evaluate the proposed approach on a multi-robotic testbed emulating the physical flow of materials through the factory.

Major goals for the project are itemized below:

[GOAL 1:] Use meta-learning to identify jobshop scheduling scenarios for factories running in a high-mix low-volume (HMLV) environment. These scenarios will be used to help train digital twins in the cyber-fabric of the manufacturing system.

[GOAL 2:] Develop decentralized multi-agent reinforcement learning algorithms that safely and optimally schedule the actions of digital twins for factory floor machines.

[GOAL 3:] Develop methods for the "safe" transfer of digital twin action policies to the physical factory floor.

[GOAL 4:] Build a multi-robotic testbed emulating the use of digital twins in managing the flow of materials across a factory floor. Demonstrate and evaluate the proposed "learning how to control" approach on this testbed.

* What was accomplished under these goals and objectives (you must provide information for at least one of the 4 categories below)?

Major Activities:

1) [Activity 1 - Testbed Development:] The project began supporting a new PhD student (Eriana Sobral) in August 2023 to assist with the development of the multi-robot testbed. The robots and networking infrastructure were purchased, delivered, and initially set up in the PI's lab space. Ms. Sobral left the project in December 2023 due to health issues, so work on the testbed was temporarily paused during Spring 2024. Work will recommence in summer 2024. The project has a new student (Roghayeh Rafiei Sangari) coming in August 2024 to begin supporting this part of the project. The outcomes from this activity will support GOAL 4 on testbed development.

2) [Activity 2 - Learning-enabled control of dynamical systems:] This activity sought to complete the learning-enabled model-following control methods discussed in the original project proposal. That approach was investigated by a Notre Dame B.S. student (Nolan Fey) as part of his B.S. thesis project. Mr. Fey graduated from Notre Dame in May 2023 and is now a PhD student at MIT. After going to MIT, Mr. Fey completed some of the open tasks in his B.S. thesis, in which he applied the learning-enabled model-following methods to a mini-cheetah quadruped robot. This hardware demonstration was done with the help of his former collaborators at Notre Dame's robotics lab (Li, Adrian, Wensing) using the methods developed by the project PI (Lemmon). Those results (see product [1]) will be reported at the 6th Annual Learning for Dynamics & Control Conference (L4DC), July 15-17, 2024, University of Oxford, England. This activity supports GOAL 3 by demonstrating the online adaptation of an RL action policy to ensure good performance of the physical (robotic) system.

3) [Activity 3 - Fair Federated Learning:] The original results reported in the project proposal focused primarily on robotic systems (see Activity 2), but this project will extend those methods to networked cyber-physical systems. This activity, therefore, began looking at how federated learning frameworks might be used for distributed learning of controllers. One of the PI's PhD students (Yuying Duan) began supporting this activity by investigating post-processing approaches for federated learning. That work developed methods for minimizing the performance lost when the federated learning framework enforces "fair" resource allocation constraints on the system's clients. These results (see product [2]) were reported in a paper that was submitted to the 2024 International Conference on Machine Learning (ICML), July 21-27, 2024, Vienna, Austria. While this paper was not accepted for presentation at the conference, it is being developed into a more complete arXiv version that will be resubmitted to NeurIPS. Starting in summer 2024, the PhD student (Yuying Duan) will begin examining how decentralized MARL action policies can be trained in a federated learning framework, with the specific objective of balancing action policy optimality against client fairness. The outcomes from this activity directly support GOAL 2 (MARL for Digital Twins).

4) [Activity 4 - Safe Sim2Real Transfer:] In Spring 2024, the PI and his student (Yuying Duan) began formally developing a method for the safe transfer of reinforcement learning (RL) action policies obtained in a virtual training environment to an unknown physical testing environment. We refer to this transfer between a training and testing environment as Sim2Real transfer. The main outcome from this activity was the formulation of a provably correct framework for Sim2Real transfer based on a finite abstraction developed from the virtual training environment. In particular, we found an algorithmically efficient method for learning an abstraction of the training environment that could then be used to generate reference trajectories that enforce "safe" operation in the physical test environment using an auxiliary controller that can be learned online. The approach combines a control barrier function (CBF) [Ames et al, IEEE-TAC, 62(8):3861-3876, 2016] with a predictive safety filter (PSF) [Wabersich et al., Automatica, 129:109597, 2021], where the CBF is learned from the training environment's finite abstraction and the PSF is generated using the DMDc methods [Proctor et al., SIAM Journal of Applied Dynamical Systems, 15(1):142-161, 2016] we used in [Lemmon et al., American Control Conference, 2022] and [Fey et al., L4DC conference, 2024]. This expanded framework for Sim2Real transfer is currently being finalized and initial reports on the outcomes will be prepared over Summer 2024. A new PhD student (Roghayeh Rafiei Sangari) will begin assisting the PI on this activity in August 2024. This activity supports GOAL 3 of the project.
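As a rough illustration of the DMDc step cited in Activity 4, the sketch below fits a linear model x[k+1] = A x[k] + B u[k] to snapshot data by least squares. It is a minimal full-rank variant without the SVD truncation used in Proctor et al., and it is not the project's implementation; the function name `fit_dmdc`, the two-state system, and the data are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_dmdc(X, U):
    """Least-squares DMDc fit of x[k+1] ~ A x[k] + B u[k].

    X: (n, T+1) array of state snapshots; U: (m, T) array of inputs.
    Returns the estimated (A, B). No rank truncation is applied.
    """
    n = X.shape[0]
    X0, X1 = X[:, :-1], X[:, 1:]          # current and next-step snapshots
    Omega = np.vstack([X0, U])            # stacked state/input data, (n+m, T)
    G = X1 @ np.linalg.pinv(Omega)        # [A B] = X1 * pinv(Omega)
    return G[:, :n], G[:, n:]

# Hypothetical 2-state, 1-input system used only to generate test data.
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])

rng = np.random.default_rng(0)
T = 50
U = rng.standard_normal((1, T))           # random excitation input
X = np.zeros((2, T + 1))
X[:, 0] = [1.0, -1.0]
for k in range(T):
    X[:, k + 1] = A_true @ X[:, k] + B_true @ U[:, k]

A_hat, B_hat = fit_dmdc(X, U)
```

With noise-free data and a sufficiently exciting input, `A_hat` and `B_hat` should match the generating matrices to numerical precision; with noisy data, a truncated SVD of the stacked data matrix is the usual regularization.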
Specific Objectives:

[Objective 1:] Develop hardware multi-robot testbed (GOAL 4)
Status: Hardware purchased and initial networking of robots complete. New student assistants are being hired to help with testbed development.

[Objective 2:] Develop Meta-learning framework for identifying challenging scenarios used in MARL training of digital twins (GOAL 1):
Status: The initial project proposed using meta-learning for robust control, where the parameters were the weights on the effort and tracking error of the control objective. Initial empirical studies did not demonstrate that this was a promising use of meta-learning. It became apparent that the game-theoretic nature of meta-learning was much more appropriate for identifying adversarial scenarios for MARL, so future work will look in this direction.

[Objective 3:] Integration of decentralized MARL with federated learning framework (GOAL 2)

Status: Initial work on federated learning suggests that post-processing of action policies is an algorithmically efficient way of optimizing digital twin performance subject to fairness constraints. Future work will look explicitly at how mean-field MARL action policies can be trained in the federated learning framework.

[Objective 4:] Develop an online method for Sim2Real Transfer (GOAL 3):

Status: The formal basis for safe Sim2Real transfer was established in the Spring 2024 semester. Future work will experimentally study the performance of the approach using the multi-robot testbed as a final proof of concept.
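To make the safety-filter idea behind Objective 4 concrete, the sketch below applies a control barrier function (CBF) filter to a one-dimensional single integrator x' = u with the safe set x <= x_max. It is a toy example under stated assumptions: the barrier h(x) = x_max - x is specified by hand rather than learned from a finite abstraction as in the project, and the names `cbf_filter`, `x_max`, and `alpha` are hypothetical.

```python
def cbf_filter(x, u_nom, x_max=1.0, alpha=2.0):
    """Minimally modify u_nom so the CBF condition holds for x' = u.

    Barrier: h(x) = x_max - x >= 0. The condition dh/dt >= -alpha * h
    reduces to u <= alpha * (x_max - x), so the closest safe input to
    u_nom (the solution of the scalar QP) is a simple clip.
    """
    h = x_max - x
    return min(u_nom, alpha * h)

# Forward-Euler simulation with a nominal input that pushes toward the boundary.
x, dt = 0.0, 0.05          # alpha * dt <= 1 keeps the discrete-time update safe
traj = [x]
for _ in range(200):
    u = cbf_filter(x, u_nom=1.0)
    x += dt * u
    traj.append(x)
```

Far from the boundary the filter passes the nominal input through unchanged; near the boundary it scales the input down so the state approaches x_max without crossing it.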
Significant Results: Nothing to report
Key outcomes or Other achievements: Nothing to report

* What opportunities for training and professional development has the project provided?

The project has helped support two Notre Dame PhD students (Eriana Sobral, Yuying Duan) and one former Notre Dame BS student (Nolan Fey).

* Have the results been disseminated to communities of interest? If so, please provide details.

Results have been disseminated through a conference paper to be presented at the Learning for Dynamics and Control Conference (L4DC), July 15-17, 2024, Oxford, England, and another paper that was submitted to the International Conference on Machine Learning (ICML), July 21-27, 2024, Vienna, Austria.

* What do you plan to do during the next reporting period to accomplish the goals?

Future Plans:

1) Bring in new student (Roghayeh Rafiei Sangari) to assist with robot testbed development (Goal 4) and Safe Sim2Real transfer (Goal 3)

2) Work with existing student (Yuying Duan) to develop Federated Learning framework for decentralized MARL (Goal 2)

3) PI has begun working with other robotics experts at Notre Dame (Prof. Yasemin Ozkan-Aydin) regarding the use of decentralized MARL for soft bio-inspired robotics (Goal 2)

4) Over summer 2024, the PI will begin investigating the use of Meta-learning or other generative learning methods for developing adversarial training scenarios. Particular attention will be paid to scenarios found in high-mix low-volume manufacturing environments.

Supporting Files

Filename	Description	Uploaded By	Uploaded On
L4DC_2024.pdf	Fey et al, L4DC, July 2024	Michael Lemmon	05/05/2024
7680_post_fair_federated_learning.pdf	Duan et al, ICML 2024 submission	Michael Lemmon	05/05/2024

Products

Books

Book Chapters

Inventions

Journals or Juried Conference Papers

N. Fey, H. Li, N. Adrian, P. Wensing, and M.D. Lemmon, "A learning-based framework to adapt legged robots on-the-fly to unexpected disturbances." To appear in the 6th Annual Learning for Dynamics & Control Conference (L4DC), July 15-17, 2024, Oxford, England. Nolan Fey will present the paper at Oxford in July. The final version of the paper will be published electronically in the Proceedings of Machine Learning Research (PMLR) and will appear on the project's website (https://www3.nd.edu/~lemmon/Projects/NSF-21-551/). Status = ACCEPTED.
Y. Duan and M.D. Lemmon, "Post-fair federated learning: achieving group and community fairness in federated learning via post-processing." Submitted to the International Conference on Machine Learning (ICML) 2024, July 21-27, 2024, Vienna, Austria. The paper was not accepted for publication, but the reviews are being used to expand it. The expanded paper will be posted to arXiv and submitted to the NeurIPS conference. The original version of the ICML paper will appear on the project website (https://www3.nd.edu/~lemmon/Projects/NSF-21-551/). Status = UNDER_REVIEW.

Licenses

Other Conference Presentations / Papers

Other Products

Other Publications

Patent Applications

Technologies or Techniques

Thesis/Dissertations

Websites or Other Internet Sites

Participants/Organizations

What individuals have worked on the project?

Name	Most Senior Project Role	Nearest Person Month Worked
Lemmon, Michael	PD/PI	1
Duan, Yuying	Graduate Student (research assistant)	2
Sobral, Eriana	Graduate Student (research assistant)	6

Full details of individuals who have worked on the project:

Michael Lemmon
Email: lemmon@nd.edu
Most Senior Project Role: PD/PI
Nearest Person Month Worked: 1

Contribution to the Project: project management

Funding Support: 1 month - this project

Change in active other support: No

International Collaboration: No
International Travel: No

Yuying Duan
Email: yduan2@nd.edu
Most Senior Project Role: Graduate Student (research assistant)
Nearest Person Month Worked: 2

Contribution to the Project: Machine Learning expert

Funding Support: 2 months - this project; 10 months - departmental teaching assistant

International Collaboration: No
International Travel: No

Eriana Sobral
Email: epinto2@nd.edu
Most Senior Project Role: Graduate Student (research assistant)
Nearest Person Month Worked: 6

Contribution to the Project: Robotic testbed development

Funding Support: 6 months - this project

International Collaboration: No
International Travel: No

What other organizations have been involved as partners?

Nothing to report.

Were other collaborators or contacts involved? If so, please provide details.

Nolan Fey - PhD student at MIT - former ND undergraduate

Prof. Yasemin Ozkan-Aydin - bio-inspired robotics professor in Notre Dame dept. of Electrical Engineering

Impacts

What is the impact on the development of the principal discipline(s) of the project?

Project results on control shields are likely to provide efficient algorithmic approaches for adapting previously learned CPS control strategies to new test environments.

What is the impact on other disciplines?

Project results on fair federated learning are likely to have a positive impact on the fair delivery of health care services.

What is the impact on the development of human resources?

The project mentored two graduate students in science and engineering.

What was the impact on teaching and educational experiences?

Nothing to report.

What is the impact on physical resources that form infrastructure?

The project added infrastructure to Notre Dame lab facilities in the form of two deep learning workstations and three mobile robots for use in developing the project's testbed.

What is the impact on institutional resources that form infrastructure?

Nothing to report.

What is the impact on information resources that form infrastructure?

Nothing to report.

What is the impact on technology transfer?

Nothing to report.

What is the impact on society beyond science and technology?

Nothing to report.

What percentage of the award's budget was spent in a foreign country?

0%

Changes/Problems

Changes in approach and reason for change

Nothing to report.

Actual or Anticipated problems or delays and actions or plans to resolve them

The initial student (Eriana Sobral) who was hired to assist with project development had to leave the university due to health issues. This resulted in a 9-month delay in the development of the project's robotic testbed. Another student has been recruited to continue this part of the project starting in August 2024.

Changes that have a significant impact on expenditures

Nothing to report.

Significant changes in use or care of human subjects

Nothing to report.

Significant changes in use or care of vertebrate animals

Nothing to report.

Significant changes in use or care of biohazards

Nothing to report.

Change in primary performance site location

Nothing to report.

Special Requirements

Responses to any special reporting requirements specified in the award terms and conditions, as well as any award specific reporting requirements.

Nothing to report.
